<html>
<head>
<title>Delta</title>
</head>
<body bgcolor=white>

<p><b>Maintainer:</b> <a href="http://danielwilkerson.com/">Daniel S. Wilkerson</a>
<br><b>Developer:</b> <a href="http://www.cs.berkeley.edu/~smcpeak/">Scott McPeak</a>
<br><b>Documenter:</b> <a href="http://http.cs.berkeley.edu/~sfg/">Simon Goldsmith</a>

<h3>Summary</h3>

<p>Delta assists you in minimizing "interesting" files subject to a
test of their interestingness.  A common such situation is when
attempting to isolate a small failure-inducing substring of a large
input that causes your program to exhibit a bug.

<p>Our implementation is based on the <a
href=http://www.st.cs.uni-sb.de/dd/>Delta Debugging algorithm</a>.  <a
href=http://www.whyprogramsfail.com/>Andreas</a> wrote a book <a
href=http://www.whyprogramsfail.com/>"Why Programs Fail"</a> about
debugging programs.

<p>This work was supported by professors <a
href=http://theory.stanford.edu/~aiken/>Alex Aiken</a> and <a
href=http://www.cs.berkeley.edu/~necula/>George Necula</a> and was
done at UC Berkeley.

<h3>Releases</h3>

<p>Feel free to just get the current Subversion repository version as
a guest user.

<ul>

<li><a
href="http://delta.tigris.org/files/documents/3103/33566/delta-2006.08.03.tar.gz">delta-2006.08.03.tar.gz</a>
&nbsp; md5:
<code>
7be4ac4ae9c1eb01ccf29d413d4cc64a
</code>
This is the current recommended release.

<li><a
href="http://delta.tigris.org/files/documents/3103/33236/delta-2006.07.15.tar.gz">delta-2006.07.15.tar.gz</a>
&nbsp; md5:
<code>
57afa6d4e7d15f380803e878c24678ed
</code>
This release had a version that multidelta did not work without the -cpp flag.

<li><a
href="http://delta.tigris.org/files/documents/3103/25616/delta-2005.09.13.tar.gz">delta-2005.09.13.tar.gz</a>
&nbsp; md5:
<code>
588d65056ea48ae2a2ecee32598c5837
</code>
This version had a bug that it was too hard to stop delta once it
started.

<li>The first release of Delta was on 14 July 2003.  It is now
retired.

</ul>

<h2>History</h2>

<p>I first wrote delta out of necessity.  Scott and I had a
quarter-million line (after prepossessing) C++ input that would crash
the C++ front-end we were working on, <a
href=http://www.cs.berkeley.edu/~smcpeak/elkhound/sources/elsa/>Elsa</a>;
there was just no way we were going to minimize that by hand.  I just
checked my little perl script into the CVS repository most of us used
in the Programming Languages group at Berkeley.  Scott later added to
it.

<p>I would hear occasionally of someone else in the group using it:
just a "Hey, thanks for delta!" every once in a while.  One day, a
graduate student came into my office; he leaned back against the wall
and spoke in a low tone.  "You know, some of our friends at Microsoft
Research have heard of your delta tool.  They would really like to use
it.  They asked me to ask you if you could please <em>release it as
Open Source</em>; the BSD license is preferred."

<p>I GPL'ed it right away.  A few years later I relented and it is now
released under the BSD license.  So there you go, you have Microsoft
to thank for the existence of delta as an open-source project.  I
think it is quite interesting given their often unfriendly public
stance on Open Source, such as that it is <a
href=http://www.theregister.co.uk/2000/07/31/ms_ballmer_linux_is_communism/>Communism</a>.
I guess they want to commune with us after all.

<p>I presented Delta at <a href="http://www.codecon.org/2006/">CodeCon
2006</a>.  The slides are <a href=delta_codecon.ppt>here</a>.

<h2>Usage overview</h2>

<p>The best way to understand how to use delta is with an example of
its usage.  <a href=#example>Below is one example</a> helpfully
written up for me by Simon Goldsmith; read it first.  For those
wanting more, I also wrote a more detailed and harder to read document
describing each tool: <a href=using_delta.html>Using Delta</a>.

<p>Note that what follows is an example of using delta to minimize an
input file to a program that reads programs, much as a compiler does.
Note two features of file minimization that are present in the
example.

<h3>Do a controlled experiment.</h3>

<p>Below we don't just minimize a file that causes Oink to produce an
error message, we minimize a file that causes gcc to accept AND oink
to reject in a specific way.  That is, the test delta does is a
controlled experiment, where gcc is the control.  Ignoring this aspect
of the problem seems to be a frequent mistake of first time users.

<h3>Exploit nested structure.</h3>

<p>One may minimize files of simpler syntax than C++ but really all
files are interesting in the first place because they are in some
language or another.  Some simple configuration files are literally
just a list of lines but most languages have some nested structure.
Multidelta filters the input through the topformflat utility
(included) to suppress any newlines past a particular nesting depth;
this "explains" the nesting structure to the otherwise line-oriented
delta utility (a brilliantly simple idea of Scott McPeak's).  If your
input file language has no nesting structure, you can hack on
multidelta to remove the filtration through topformflat or just use
the raw delta program.  If your language has a different nesting
structure than C/C++, you can write your own multidelta and substitute
it.  A simple flex program should suffice; it need not be terribly
accurate for delta to do well.

<a name=example>
<h2>An example</h2>
</a>

<p>Simon Goldsmith 8 April / 12 Sept, 2005.

<p>Note that this example is edited for simplicity from the raw
output; we sincerely hope we did not introduce any bugs.

<h3>Setup</h3>

<p>(1) Make a new directory and copy the file to be minimized there.
<pre>
% mkdir deltaexample
% cd deltaexample/
% cp ../nsCSSDataBlock-23801-1112390043.cpp.g.ii ./foo.ii
% chmod +w foo.ii
</pre>

<p>(2) (optional) Put a read-only backup copy of the file in, say, orig/ .
<pre>
% mkdir orig
% cp foo.ii orig/
% chmod -R a-w orig
</pre>

<h3>Define interestingness</h3>

<p>(3) Write a script (do not call it 'test' as that is a system
utility program) to test the interestingness of the file, as we do
below.

<p>Note that for this example, "interesting" means the file passes gcc
but fails oink with a particular error message.  That is, if 1) gcc
accepts, and 2) oink rejects with the desired error message, then we
return zero (meaning "interesting").  If anything else happens then we
return a nonzero exit code (meaning "not interesting")

<p>Some reminders about shell: a zero exit code means "true"; so for
the purposes of &&, a zero exit code means "keep going" and grep
returns 0 if it matches, nonzero if not.  We redirect output to
/dev/null because the output of delta is noisy.  Be careful of quoting
hell: notice that we've used '.' to match characters like single
quote.

<pre>
% cat > test1.sh
#!/bin/bash

FILE=foo.ii
OINK=/home/simon/oink_all/oink/oink
GCC=/usr/bin/gcc

$GCC -c $FILE -o /dev/null &> /dev/null && $OINK $FILE | \
  grep 'error: cannot convert argument type .class .* const &. ' \
  'to receiver parameter type' \
  &> /dev/null
^D
</pre>

<p>(4) Make the script executable and run it on the file -- make sure
it returns 0.  Optionally turn off the redirection to /dev/null
temporarily to check the error message that is being found by the
grep.
<pre>
% chmod +x test1.sh
% ./test1.sh foo.ii ; echo $?
0
</pre>

<h3>Minimize automatically</h3>

<p>(5) Run multidelta with the script on the file several times at, say, levels 0 0 1 1 2 2 10.
<pre>
% multidelta -level=0 ./test1.sh foo.ii
(check email)
% multidelta -level=0 ./test1.sh foo.ii
(read slashdot)
% multidelta -level=1 ./test1.sh foo.ii
% multidelta -level=1 ./test1.sh foo.ii
% multidelta -level=2 ./test1.sh foo.ii
% multidelta -level=2 ./test1.sh foo.ii
% multidelta -level=10 ./test1.sh foo.ii
% multidelta -level=10 ./test1.sh foo.ii
</pre>

<p>(6) The input file will be modified in place and you should be left with something smaller.
<pre>
[simon@otter][deltaexample]$ ls -l
total 116
-rw-r--r--  1 simon simon  8451 Sep 12 17:10 foo.ii
-rw-r--r--  1 simon simon  8948 Sep 12 17:10 foo.ii.bak
-rw-r--r--  1 simon simon  8451 Sep 12 17:10 foo.ii.ok
-rw-r--r--  1 simon simon 57739 Sep 12 17:10 log
-rw-r--r--  1 simon simon  2752 Sep 12 17:10 multidelta.log
dr-xr-xr-x  2 simon simon  4096 Sep 12 17:16 orig/
-rwxr-xr-x  1 simon simon   385 Sep 12 16:36 test1.sh*
-rw-r--r--  1 simon simon    11 Sep 12 16:02 test1.sh~

[simon@otter][deltaexample]$ ls -l orig/
total 552
-r--r--r--  1 simon simon 558970 Sep 12 16:00 foo.ii
</pre>

<h3>Minimize further by hand</h3>

<p>(7) Hack on foo.ii by hand, re-running test1.sh each time to check it
is still "interesting".  Sometimes it helps to hack on foo.ii a little
to get delta unstuck and then rerun delta again.  You might want to
run indent as well whenever you stop to look at foo.ii as topformflat
makes a mess.

<p>Final file:
<pre>
class A {};
int main() {
  const A *val;
  val->~A ();
}
</pre>

<P>Note that the original file was about 560 KB!

<h2>Endorsements</h2>

<p>This section is just for fun because I've never had a tool so
widely used before.

<pre>
Subject: Thanks for Delta
   Date: Thu, 21 Jul 2005 21:13:20 -0700
   From: Flash Sheridan
     To: Daniel S. Wilkerson

This is just a quick thank-you note for Delta.  Andrew Pinski pointed
me towards it after filing a GCC bug with a very long source file; it
immediately reduced a different bug file from 16K lines to ten (GCC
bug 22604).  Oddly enough, it initially found a different bug (22603),
since I'd only specified "internal compiler error", not "segmentation
fault".  I might not have been able to file either of these bugs
without Delta, since the code was proprietary, but the two
Delta-reduced files were small enough to make public.

<hr align=left width=10%>
Subject: Re: Thanks for Delta
   Date: Tue, 13 Sep 2005 10:56:10 -0700
   From: Flash Sheridan
     To: Daniel S. Wilkerson

Delta has become even more valuable since my initial thank-you note.
I'm not sure it's helped with all of the GCC bugs I've been filing
(I've been tracking them at
http://pobox.com/~flash/FlashsOpenSourceBugReports.html), but I
couldn't have filed most of them without Delta.  Typically I find a
bug when GCC is compiling a large, confidential file, which I couldn't
post to Bugzilla.  Delta has always been able to find a radically
smaller file, which I have been able to attach to my bug report.

<hr align=left width=10%>
Date: Sun, 23 Oct 2005 22:01:42 +0200 (CEST)
From: Richard Guenther
  To: Daniel S. Wilkerson

> > > Thanks for your interest in Delta.  I would be interested to
> > > hear more about what you are doing with it.  If it is something
> > > I can put in the endoresements section ("Delta saved my
> > > daugher's life!") that would be great.
> >
> > Well, delta is saving a lot of gcc developers life ;) I would
> > guess 1 of 3 bugs sumitted to the gcc bugzilla get their testcase
> > reduced using delta.
> 
> Holy moly!  Can I quote that publicly such as on my web page?

Yes - a little bit more accurate would be to say we're using delta to
reduce all testcases from the gcc bugzilla in case they get entered
unreduced.
</pre>

<hr align=left width=10%>

<p>Delta (both the algorithm and this tool) has been used in the <a
href=http://www-inst.eecs.berkeley.edu/~cs169/fa05/index.shtml>Cal
Berkeley (CS169)</a> and <a
href=http://www.stanford.edu/class/cs295/>Stanford (CS295)</a> in
software engineering classes.

<pre>
Subject: Re: delta debugging on instructional machines?
   Date: Mon, 24 Oct 2005 22:55:08 -0700
   From: Gilad Arnold <arnold@eecs.berkeley.edu>
     To: Daniel S. Wilkerson

We've just assigned a delta-related homework to the students today,
. . .  We do hope that we can actually convince the students to use
delta in the course of their project development, but time will
tell. And thanks again!

<hr align=left width=10%>
Subject: Re: use of delta in your class
   Date: Mon, 24 Oct 2005 13:57:22 -0700
   From: Alex Aiken
     To: Daniel S. Wilkerson

Yes, I gave them a homework assignment for CS295 using delta.  Feedback 
was positive but unquantified.
</pre>

<h2>&nbsp;</h2>

</body>
</html>
